Davem/xspod #23795

iabyn · 2025-10-02T11:17:26Z

Rewrite perlxs.pod

This branch completely rewrites and modernises the XS reference manual,
perlxs.pod.

The new file is about twice the size of the old one.

This branch:

deletes some obsolete sections;
reorders the existing sections into a more logical order;
adds a large new introductory/overview part, which explains
all the background needed to understand what XSUBs do, including
SVs, the stack, reference counts, magic etc.;
includes a BNF syntax section;
modernises: e.g. it uses "ANSI" parameter syntax throughout;
has a fully-worked example using T_PTROBJ.

Note that although each commit in this branch may have a complex-looking
diff for the updating of a particular section, in reality most sections
have been rewritten from scratch, and the diff output is showing
paragraph breaks as fixed unchanging points, so that it appears as lots of
individual paragraph changes rather than "delete all this text, add new
text". If reviewing, it may be easier to just read the final perlxs.pod
file instead of looking at the diffs.

This set of changes requires a perldelta entry, and I will write one later.

The XS parser supported an extremely obscure bit of functionality which made use of the %v package variable to maintain state between different bits of typemap processing. This was accidentally broken in 5.10.0: refactoring removed the 'use vars "%v"' line, and no one seemed to notice or care. Also, the sole example of its use in the docs seemed to be obscure, confusing and probably wrong. There was a consensus in the discussion at http://nntp.perl.org/group/perl.perl5.porters/267667 that we should stop documenting this feature rather than trying to fix it.

The various XS code examples had odd and inconsistent indentation (often with 5 leading spaces) and inconsistent formatting, e.g. foo(a,b) vs foo( a, b ) vs foo(a, b). Fix that, and also remove any tab chars. Whitespace-only change.

This commit is a simple cut which deletes several '=head2' sections from perlxs.pod. The next commit will tidy up and fix any broken links etc. These sections are more tutorial-like, and aren't in line with the goal of this branch that perlxs.pod becomes purely a reference manual for XS. Any relevant information from these sections may be incorporated later into new sections in perlxs.pod and/or be included in a future rewrite of perlxstut.pod.

The sections deleted are:

    =head2 Introduction
    =head2 On The Road
    =head2 The Anatomy of an XSUB
    =head2 The Argument Stack
    =head2 The RETVAL Variable
    =head2 Returning SVs, AVs and HVs through RETVAL
    =head2 Returning Undef And Empty Lists
    =head2 Interface Strategy
    =head2 Perl Objects And C Structures

The previous commit deleted several sections from perlxs.pod. This commit fixes things up; done as a separate commit so that the changes aren't drowned out in the diff listing.

This big commit does a series of plain cut+pastes to reorder all the =head2 sections within the file. This changes the order from semi-random into roughly the order the various XS keywords would appear within an XS file, and then within an XSUB declaration/definition. No changes have been made to the text: simply that all lines from a particular '^=head2' up until the next head2 have been cut+paste as a single unit. No attempt has been made yet to make the text consistent with the new ordering; that will be done by the subsequent commits of this branch.

The previous ordering in this file was:

    =head1 NAME
    =head1 DESCRIPTION
    =head2 The MODULE Keyword
    =head2 The PACKAGE Keyword
    =head2 The PREFIX Keyword
    =head2 The OUTPUT: Keyword
    =head2 The NO_OUTPUT Keyword
    =head2 The CODE: Keyword
    =head2 The INIT: Keyword
    =head2 The NO_INIT Keyword
    =head2 The TYPEMAP: Keyword
    =head2 Initializing Function Parameters
    =head2 Default Parameter Values
    =head2 The PREINIT: Keyword
    =head2 The SCOPE: Keyword
    =head2 The INPUT: Keyword
    =head2 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords
    =head2 The C<length(NAME)> Keyword
    =head2 Variable-length Parameter Lists
    =head2 The C_ARGS: Keyword
    =head2 The PPCODE: Keyword
    =head2 The REQUIRE: Keyword
    =head2 The CLEANUP: Keyword
    =head2 The POSTCALL: Keyword
    =head2 The BOOT: Keyword
    =head2 The VERSIONCHECK: Keyword
    =head2 The PROTOTYPES: Keyword
    =head2 The PROTOTYPE: Keyword
    =head2 The ALIAS: Keyword
    =head2 The OVERLOAD: Keyword
    =head2 The FALLBACK: Keyword
    =head2 The INTERFACE: Keyword
    =head2 The INTERFACE_MACRO: Keyword
    =head2 The INCLUDE: Keyword
    =head2 The INCLUDE_COMMAND: Keyword
    =head2 The CASE: Keyword
    =head2 The EXPORT_XSUB_SYMBOLS: Keyword
    =head2 The & Unary Operator
    =head2 Inserting POD, Comments and C Preprocessor Directives
    =head2 Using XS With C++
    =head2 Safely Storing Static Data in XS
    =head3 MY_CXT REFERENCE
    =head1 EXAMPLES
    =head1 CAVEATS
    =head2 Use of standard C library functions
    =head2 Event loops and control flow
    =head1 XS VERSION
    =head1 AUTHOR DIAGNOSTICS
    =head1 AUTHOR

and is now:

    =head1 NAME
    =head1 DESCRIPTION
    =head2 The MODULE Keyword
    =head2 The PACKAGE Keyword
    =head2 The PREFIX Keyword
    =head2 Inserting POD, Comments and C Preprocessor Directives
    =head2 The REQUIRE: Keyword
    =head2 The VERSIONCHECK: Keyword
    =head2 The PROTOTYPES: Keyword
    =head2 The EXPORT_XSUB_SYMBOLS: Keyword
    =head2 The INCLUDE: Keyword
    =head2 The INCLUDE_COMMAND: Keyword
    =head2 The TYPEMAP: Keyword
    =head2 The BOOT: Keyword
    =head2 The FALLBACK: Keyword
    =head2 The NO_OUTPUT Keyword
    =head2 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords
    =head2 Default Parameter Values
    =head2 The C<length(NAME)> Keyword
    =head2 Variable-length Parameter Lists
    =head2 The PREINIT: Keyword
    =head2 The INPUT: Keyword
    =head2 The NO_INIT Keyword
    =head2 Initializing Function Parameters
    =head2 The & Unary Operator
    =head2 The SCOPE: Keyword
    =head2 The INIT: Keyword
    =head2 The C_ARGS: Keyword
    =head2 The CODE: Keyword
    =head2 The PPCODE: Keyword
    =head2 The POSTCALL: Keyword
    =head2 The OUTPUT: Keyword
    =head2 The CLEANUP: Keyword
    =head2 The PROTOTYPE: Keyword
    =head2 The OVERLOAD: Keyword
    =head2 The ALIAS: Keyword
    =head2 The INTERFACE: Keyword
    =head2 The INTERFACE_MACRO: Keyword
    =head2 The CASE: Keyword
    =head2 Using XS With C++
    =head2 Safely Storing Static Data in XS
    =head3 MY_CXT REFERENCE
    =head1 EXAMPLES
    =head1 CAVEATS
    =head2 Use of standard C library functions
    =head2 Event loops and control flow
    =head1 XS VERSION
    =head1 AUTHOR DIAGNOSTICS
    =head1 AUTHOR

Following the previous commit's reordering of all the =head2 sections, demote most of the =head2 headers to =head3, and add some new =head2 headers which group together related headers. Also add some =head3's for a few missing keywords. Subsequent commits will flesh out the new sections.

Four commits ago, I removed most of the general text sections in perlxs (i.e. the ones not specifically about a particular keyword). Now this commit adds a completely new introductory part to perlxs, about 1200 lines long. It represents an attempt to write a background to what XS and XSUBs, SVs, typemaps etc are, in a complete and modern way. The existing reference section for each keyword follows it. I tried to avoid getting too tutorial-like (that's what perlxstut is for), but I may have crossed the line in various places. In particular it has a new section which could have been titled "all the bits of perlguts you need to know in order to write non-trivial XSUBs without having to actually read perlguts".

Add a section which semi-formally tries to define the syntax and structure of an XS file, using a BNF-like format. See http://nntp.perl.org/group/perl.perl5.porters/268701 for the discussion of this part.

Rewrite the POD for these three keywords, and in particular, treat them as one declaration, rather than three unrelated keywords.

Populate the new =head2 File-scoped XS Keywords and Directives section, partially by cannibalising (and then deleting) the old =head3 Inserting POD, Comments and C Preprocessor Directives subsection. This commit only adds text about directives; subsequent commits will update the various file-scoped keywords.

Populate the new

    =head2 The Structure of an XSUB
    =head2 An XSUB Declaration

sections

Add some initial text for this new section, and also add a new subsection "XSUB Parameter Placeholders".

Rewrite (and retitle) these three subsections:

    =head3 Default Parameter Values
    =head3 The C<length(NAME)> Keyword
    =head3 Variable-length Parameter Lists

Add text for the new '=head2 The XSUB Input Part' section, and rewrite the existing entry for the PREINIT keyword.

This commit completely rewrites this section and subsections:

    =head3 The INPUT: Keyword
    =head4 The NO_INIT Keyword
    =head4 Initializing Function Parameters
    =head4 The & Unary Operator

It de-emphasises the INPUT keyword and suggests using ANSI XS signatures etc instead.

Add text for the new '=head2 The XSUB Init Part' section, and rewrite the existing entry for the INIT keyword.

Add text to the new

    =head2 The XSUB Code Part
    =head3 Auto-calling a C function

sections, and rewrite the existing

    =head4 The C_ARGS: Keyword

section

Rewrite these sections:

    =head3 The CODE: Keyword
    =head3 The PPCODE: Keyword

This keyword formerly wasn't documented. The docs now say "this is what it is, but don't use it".

Add text to the new

    =head2 The XSUB Output Part

section, and rewrite the text in these existing sections:

    =head3 The POSTCALL: Keyword
    =head3 The OUTPUT: Keyword

Add text to the new =head2 The XSUB Cleanup Part section, and rewrite the text in this existing section: =head3 The CLEANUP: Keyword

Add text to the new =head2 XSUB Generic Keywords section, and rewrite the text in this existing section: =head3 The PROTOTYPE: Keyword

This keyword was undocumented, even though it had been added 25 years ago.

Populate the introduction to this new section.

Rewrite this section: =head3 The ALIAS: Keyword

Rewrite these sections:

    =head3 The INTERFACE: Keyword
    =head3 The INTERFACE_MACRO: Keyword

also demote the second to be a head4 child of the first. Then expand the T_PTROBJ example to use INTERFACE as an alternative to ALIAS.

Rewrite this section: =head3 The CASE: Keyword

Populate this new section (except for the T_PTROBJ subsection, which had already been added by an earlier commit within this branch). Note that the "Common typemaps" subsection could probably benefit from some further expansion by someone familiar with which built-in T_FOO entries are useful.

Rewrite this section: =head2 Using XS With C++

Disclaimer: I've never written a proper C++ program. I had to (literally) dust off my 34-year-old copy of Stroustrup(*) and also do some Googling. Hopefully what I've written is sane.

(*) This was bought back in the days when people used to learn things by buying books, and when I thought that I ought to know something about this newfangled C++ thing. I never got round to reading all of it: I discovered Perl around the same time, which looked to be a lot more fun.

Revise the text in this section: =head2 Safely Storing Static Data in XS

Rewrite this section: =head1 EXAMPLES

Basically, delete the one big example in this section and provide links to various other examples already present in this document instead.

Tweak the final few sections of perlxs.pod.

tonycoz · 2025-10-06T04:41:01Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+L<perlcall>: this describes how to call Perl functions and do the
+equivalent of C<eval ""> from C.
+
+=item *
+
+L<perlembed>: this describes how to embed a complete Perl interpreter
+within another application.


perlcall barely mentions eval_pv; perlembed provides much better documentation for eval_pv/eval_sv.

tonycoz · 2025-10-06T04:51:10Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+
+There is a standard system typemap file which contains rules for common C
+and Perl types, but you can add your own typemap file in addition, and
+from perl 5.16.0 onwards you can also add typemap declarations inline


Shouldn't this refer to the version of EU::PXS that added embedded typemaps (3.01) instead of the perl version?

Practice throughout the document is inconsistent. There are numerous cases of “since 5.*” or “Perl 5.*”, but also some cases where both the Perl version and the ParseXS version are stated, e. g. in the sections SCOPE and ALIAS. And there’s at least one case where only the ParseXS version is stated (I forgot in which section).

I think trying to consistently state both versions might be useful: The ParseXS version because it’s technically the more relevant one here, and the Perl version because it’s more intuitive for many people.

No, I believe adding Perl versions will only add confusion here.

tonycoz · 2025-10-06T05:31:29Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+If you need to coerce an SV to a string (e.g. before modifying its string
+value) then use C<SvPV_force()> or one of its variants. For example if


Perhaps "before directly modifying its string buffer".

You don't need to force normal before calling sv_setpvn().

johannessen

I left some comments, mostly about minor things.

Overall, I found the document to be well structured and the prose to be easily readable, in spite of the fairly complex topic.

johannessen · 2025-10-06T21:54:27Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+in a particular C library, the XSUB definitions in the XS file are often
+just a couple of lines, consisting of a declaration of the name,
+parameters and return type. The XS parser will do almost all the heavy
+lifting for you,


s/you,/you./

johannessen · 2025-10-06T22:10:32Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+    require XSLoader;
+    XSLoader::load(__PACKAGE__, $VERSION);


XSLoader 0.14 (core since Perl 5.15) offers a simpler syntax:

XSLoader::load();

Given that the goal of this rewrite is to modernise the documentation, should this newer syntax be used in the example?

That automatically adds the module name but not the expected version. Personally I wouldn't use it.

perldoc XSLoader says:

A sanity check is done to ensure that the versions of the .pm and the (compiled) .xs parts are compatible. If $VERSION was specified, this is used for the check. If not specified, it defaults to $XS_VERSION // $VERSION (in the module's namespace)

So that sounds fine to me.

Right. The XSLoader documentation is correct. The default is applied by S_xs_version_bootcheck(), which is called during the handshake.

I think the overall recommendation for users should be to use no arguments in XSLoader::load(), for simplicity. So if there’s no particular strategy behind showing these arguments in the example code, I think they should be removed here.

You're right. It's just implemented completely differently from what I expected so I erroneously concluded that that didn't work.
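
To make the default quoted from the XSLoader documentation above concrete, here is a small pure-Perl sketch of the `$XS_VERSION // $VERSION` fallback. The packages Foo::Bar and Foo::Baz and the helper default_xs_version() are hypothetical; this only mirrors the documented rule and is not how XSLoader implements it (as noted above, the real default is applied in C during the handshake).

```perl
use strict;
use warnings;

# Two hypothetical modules: one sets only $VERSION,
# the other also sets $XS_VERSION.
{ package Foo::Bar; our $VERSION = '1.02'; }
{ package Foo::Baz; our $VERSION = '1.02'; our $XS_VERSION = '1.02_01'; }

# Hypothetical helper: the version that would be checked when
# XSLoader::load() is called with no arguments, per the documented default.
sub default_xs_version {
    my ($pkg) = @_;
    no strict 'refs';
    return ${"${pkg}::XS_VERSION"} // ${"${pkg}::VERSION"};
}

print default_xs_version('Foo::Bar'), "\n";   # prints 1.02
print default_xs_version('Foo::Baz'), "\n";   # prints 1.02_01
```

So a module that only sets $VERSION gets the same check as with the explicit two-argument form.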

johannessen · 2025-10-06T22:31:16Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+XSUB to be declared without raising a "duplicate XSUB" warning. This
+warning suppression only works for the if/else/endif form. For example
+this works:


I was, at first, confused by this text: From reading it, and the examples below it, it seemed like what doesn’t work is the #ifndef directive, specifically. Furthermore, the examples use “ifdef”, whereas this text uses “if”, which made it look like there’s a typo here.

After re-reading the old perlxs document, I understand that the presence of #else in particular is the key here. I think the new text isn’t very clear if you don’t already know that. How about something like this:

XSUB to be declared without raising a "duplicate XSUB" warning. This warning suppression only works when an else branch is present. For example, this works:

johannessen · 2025-10-06T22:33:35Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+    REQUIRE: 3.58
+
+The C<REQUIRE> keyword is used to indicate the minimum version of the
+C<ExtUtils::ParseXS> XS compiler (and its F<xsubpp> wrapper) needed to
+compile the XS module. It is expected to be a floating-point number of the
+form C<\d+\.\d+/>. It is analogous to the perl C<use v5.xx>.


-form C<\d+\.\d+/>. It is analogous to the perl C<use v5.xx>.
+form C<\d+\.\d+/>. It is analogous to the perl C<use ExtUtils::ParseXS x.xx>.

Also, the version regex has a trailing / but not a leading /.
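
For illustration, the intended shape of a REQUIRE version (digits, a dot, digits) can be checked with a pattern anchored at both ends. This is only a sketch of the format described in the quoted text, not the check ExtUtils::ParseXS itself performs, and the sample strings are made up.

```perl
use strict;
use warnings;

# "digits, dot, digits", anchored at both ends.
my $version_re = qr/\A\d+\.\d+\z/;

for my $candidate ('3.58', '3', '3.58.1', 'v3.58') {
    printf "%-7s %s\n", $candidate,
        $candidate =~ $version_re ? 'accepted' : 'rejected';
}
```

Only the first sample string has the two-part numeric form.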

johannessen · 2025-10-06T22:37:13Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+The name of the XSUB is usually put on the line following the type, in
+which case it must be on column one. It is permissible for both the return
+type and name to be on the same line.


Very minor nit: I think those sentences would read better if the phrase “return type” was used first.

-The name of the XSUB is usually put on the line following the type, in
-which case it must be on column one. It is permissible for both the return
+The name of the XSUB is usually put on the line following the return type,
+in which case it must be on column one. It is permissible for both the
 type and name to be on the same line.

johannessen · 2025-10-07T09:32:10Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+      OVERLOAD: \"\"

-Parameters preceded by C<OUTLIST> keyword do not appear in the usage
-signature of the generated Perl function.
+This could be regarded as a bug.


Does “bug” imply that it might get fixed, and the escaped syntax OVERLOAD: \"\" might become illegal in future?

This feels similar to an explicit OUTPUT: RETVAL being required for non-autocall usage, which is described as “probably a bad design decision, but we're stuck with it now” in the section on CODE above.

Should these two cases be described with similar language, or is there a real difference here with regards to possible future changes?

johannessen · 2025-10-07T09:55:31Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+object); and third, a swap flag. See L<overload> for the full details
+of how these functions will be called, with what arguments. Note that
+C<swap> can in fact be undef in addition to false, to indicate an assign
+overload such as C<+=>.


To detect the difference between swap being false and undef, you’d need to declare it as SV* and use SvOK() and SvTRUE(), right? Should this be pointed out, or is it obvious enough?
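
The three states of the swap flag are easy to see from a pure-Perl overload handler, which may help decide how much the XS text needs to say. This is a minimal sketch: the class My::Num and its subtract() handler are invented for illustration. An XS handler would draw the same distinction on an SV* parameter with SvOK() and SvTRUE(), as suggested above.

```perl
use strict;
use warnings;

package My::Num;
use overload '-' => \&subtract;   # no '-=' handler: perl falls back to '-'

our @seen;   # how the swap flag looked on each call

sub new { my ($class, $n) = @_; return bless { n => $n }, $class }

sub subtract {
    my ($self, $other, $swap) = @_;
    push @seen, !defined $swap ? 'undef' : $swap ? 'true' : 'false';
    my $o = ref $other ? $other->{n} : $other;
    return My::Num->new($swap ? $o - $self->{n} : $self->{n} - $o);
}

package main;

my $x  = My::Num->new(10);
my $r1 = $x - 1;    # operands in normal order: swap is defined but false
my $r2 = 1 - $x;    # operands swapped:         swap is true
$x -= 1;            # assignment variant:       swap is undef

print join(',', @My::Num::seen), "\n";   # prints false,true,undef
```

A handler that only tests the flag for truth cannot tell the first and third calls apart, which is the point of the question above.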

johannessen · 2025-10-07T10:17:02Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+C<ALIAS> which is more suited for autocall. Note that C<ALIAS> should not
+be used together with either of C<INTERFACE> or C<ATTRS>.


For symmetry with the section on OVERLOAD:

-be used together with either of C<INTERFACE> or C<ATTRS>.
+be used together with either of C<ATTRS>, C<INTERFACE>, or C<OVERLOAD>.

johannessen · 2025-10-07T10:56:33Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+are converted back and forth via C<(UV)> casts. A few unsigned types such
+as C<I16> and C<U32> are instead mapped to C<T_U_SHORT> and C<T_U_LONG> XS


johannessen · 2025-10-07T12:56:51Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+=head3 Fully-qualified type names and Perl objects
+
+    Foo::Bar
+    foo(Foo::Bar self, ...)
+
+Normally the type of an XUB's parameter or return value is a valid C type,
+such as C<"char *">. However you can also use Perl package names. When a


What happens when such a parameter receives a Perl object blessed to a different package? Is @ISA a factor at all?

I don’t know the answer, but feel like this should be addressed (briefly) somewhere in this document. For example the section The OVERLOAD: Keyword makes it sound like there is an automatic type check, but it remains unclear whether it's just for the svtype or if the package is considered in some way:

will croak with an Expected foo to be of type My::Num, got scalar error

It depends on the typemap, the commonly used T_PTROBJ follows @ISA.

OVERLOAD just does use overload 'op' => \&xs_sub.

Ah, of course. Yes. Thank you for reminding me.

I suggest adding a direct link for the T_PTROBJ documentation, for example like this:

@@ -4135,10 +4135,10 @@ argument, until finally some sort of destroy function frees the handle and its data. The C<T_PTROBJ> typemap is one common method for mapping Perl
-objects to such C library handles. Behind the scenes, it uses blessed
-scalar objects with the scalar's integer value set to the address of the
-handle. The C<INPUT> code template of the C<T_PTROBJ> typemap retrieves the
-pointer from the scalar object referred to by a passed RV argument, while
-the C<OUTPUT> template creates a new blessed RV-to-SV with the handle
-address stored in it.
+objects to such C library handles; see L<perlxstypemap/T_PTROBJ>. Behind the
+scenes, it uses blessed scalar objects with the scalar's integer value set
+to the address of the handle. The C<INPUT> code template of the C<T_PTROBJ>
+typemap runs type checks and retrieves the pointer from the scalar object
+referred to by a passed RV argument, while the C<OUTPUT> template creates a
+new blessed RV-to-SV with the handle address stored in it.
 For the purposes of an example, we'll create here a minimal example C
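
The @ISA behaviour mentioned above for T_PTROBJ can be sketched in plain Perl. The stock T_PTROBJ input template tests SvROK() and sv_derived_from(), and UNIVERSAL::isa() is the Perl-level counterpart of the latter. The classes and the accepts() helper below are hypothetical and only mimic that check; no XS is involved.

```perl
use strict;
use warnings;

package My::Num;
sub new { my ($class, $n) = @_; return bless \(my $v = $n), $class }

package My::Num::Big;
our @ISA = ('My::Num');          # subclass of My::Num, inherits new()

package Other;
sub new { my ($class) = @_; return bless \(my $v = 0), $class }

package main;

# Hypothetical helper: must be a reference, and must be derived
# from the named package (subclasses count).
sub accepts {
    my ($arg, $class) = @_;
    return ref($arg) && UNIVERSAL::isa($arg, $class) ? 1 : 0;
}

print accepts(My::Num->new(1),      'My::Num'), "\n";   # prints 1
print accepts(My::Num::Big->new(2), 'My::Num'), "\n";   # prints 1
print accepts(Other->new,           'My::Num'), "\n";   # prints 0
print accepts(42,                   'My::Num'), "\n";   # prints 0
```

So an object blessed into a subclass is accepted, while an unrelated class or a plain scalar is rejected, which is the case where the XSUB would croak.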

Leont · 2025-10-08T23:52:56Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+
+    short
+    baz(int a, char *b = "")
+      PREINIT:


Should we still use PREINIT? I see its use case in C89 for declarations, but in C99 it can be confusing, mainly because it runs before argument handling.

Leont · 2025-10-08T23:55:50Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+completely within the signature (i.e. which don't use an C<INPUT> section
+to specify their type).
+
+Perls before 5.36 used C89 compiler semantics, which didn't allow variable


I don't think this is entirely true. We used C89 for perl itself, but not necessarily for CPAN modules.

iabyn · 2025-10-09T06:50:21Z

On Wed, Oct 08, 2025 at 04:53:18PM -0700, Leon Timmermans wrote:
> Should we still use PREINIT? I see it's use-case in C89, but in C99 it can be confusing (mainly because it runs before argument handling).

I didn't look too closely, but my quick experimentation seemed to indicate that the flag(s) telling perl to use C99 weren't necessarily being propagated to XS compilation. So I was playing safe. Also if people want to write code that runs on older perls, or have any reason to declare and possibly initialise a var before argument processing, then PREINIT should be the official way.

tonycoz · 2025-10-09T21:39:54Z

Yes, the -std flag is set by the cflags script and not in $Config{ccflags}.

Some systems (I think non-x86 BSDs) can use old gccs that don't default to c99 or later.

iabyn · 2025-10-12T07:41:14Z

On Sat, Oct 11, 2025 at 03:00:31PM -0700, Leon Timmermans wrote:
> +    require XSLoader;
> +    XSLoader::load(__PACKAGE__, $VERSION);
>
> That automatically adds the module name but not the expected version. Personally I wouldn't use it..

I was just following the form of the Foo.pm which h2xs generates. I have no knowledge or strong opinions about the usage of XSLoader.

wolfsage · 2025-10-17T00:35:31Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+
+    /* For efficiency, always define PERL_NO_GET_CONTEXT: not enabled by
+     * default for backwards compatibility. For details, see
+     * L<perlguts|perlguts/


Is this intended to be a link to other documentation? Because this is in a code block, it won't be treated that way by pod parsers

iabyn · 2025-10-17T08:54:49Z

On Thu, Oct 16, 2025 at 05:44:45PM -0700, Matthew Horsfall (alh) wrote:
> Is this intended to be a link to other documentation? Because this is in a code block, it won't be treated that way by pod parsers

Yes, it was originally intended to link. Then I realised that (as you point out) it doesn't link, but I forgot to update it.

tonycoz · 2025-10-20T03:45:39Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+    int
+    X::Y::new(int i)


Shouldn't new() return an X::Y *?

tonycoz · 2025-10-21T02:57:50Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+=head3 Perl OPs
+
+An C<OP> is is a data structure within the perl interpreter. It is used to
+hold the nodes within a tree structure created when the perl source is
+compiled. It usually represents a single operation within the perl source,
+such as an add, or a function call. The structure has various flags and
+data, and a pointer to a C function (called the PP function) which is used
+to implement the actions of that OP. The main loop of the perl interpreter
+consists of calling the PP function associated with the current OP
+(C<PL_op>) and then updating it, typically to C<< PL_op->op_next >>. In
+particular, the C<OP_ENTERSUB> performs (or at least starts) a function
+call.


99%* of XS code doesn't touch OPs, I don't think it needs to be the first heading, or even mentioned at all here.

* I made this number up

tonycoz · 2025-10-21T03:34:41Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+
+=head3 The SV scalar value structure
+
+As mentioned above, almost runtime data within the perl interpreter is


I'm not sure it's good to go into so much detail on the internal structure of SVs here.

I expect nearly all XS code is fine with newSViv()/sv_setiv()/SvIV() etc rather than looking at any flags beyond SvOK() and SvUTF8().

I think this entire section on SVs is put together well and parts of it are quite important. It’s not obvious to me what could be cut. Could you be more specific?

The flags in particular might not be useful information for everybody, but explaining the general principle of how an SV works I feel is crucial, and the flags are a big part of that. FWIW, reading that part made me recognise two bugs in one of my modules. I think this section is fine.

tonycoz · 2025-10-21T03:36:26Z

I'm done for now.

iabyn · 2025-10-21T08:34:26Z

On Mon, Oct 20, 2025 at 07:58:13PM -0700, Tony Cook wrote:
> +=head3 Perl OPs
> ...
> 99%* of XS code doesn't touch OPs, I don't think it needs to be the first heading, or even mentioned at all here.

I mentioned OPs mainly as a way of introducing OP_ENTERSUB, which I needed to do to explain how XSUBs get called. Also, I wanted to give readers a bit of background knowledge about how the interpreter works, to make sense of things when debugging. I'll think about this some more.

iabyn · 2025-10-21T08:37:56Z

On Mon, Oct 20, 2025 at 08:35:05PM -0700, Tony Cook wrote:
> I'm not sure it's good to go into so much detail on the internal structure of SVs here.
>
> I expect nearly all XS code is fine with newSViv()/sv_setiv()/SvIV() etc rather than looking at any flags beyond SvOK() and SvUTF8().

I wanted to educate people into the difference between SvIVX() and SvIV(), for example. I've emphasised that they should generally use the latter, but at least now they're aware of the difference when they then try and cargo-cult something from an existing distro that incorrectly uses SvIVX().

wolfsage

(more to come, don't want this to get lost)

wolfsage · 2025-10-22T22:37:51Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+
+Note that much of the L<data passing overview|/Overview of how data is
+passed to and from an XSUB> part of this document is just a summary of
+the parts of of perlguts which are most relevant to writing XS code,


nit: s/of of/of/

wolfsage · 2025-10-22T22:56:30Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+them it should be done in a way which doesn't increase its reference
+count. For example, this modifies C<sv> to be a reference to a
+newly-created SV holding an integer value, i.e. the perl equivalent of
+C<$sv =\99>:


nit: A space after the '=' would be good so it looks like $sv = \99

wolfsage · 2025-10-22T22:57:30Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+reference count to it), then if the code croaks, the SV on the stack will
+leak. To avoid this, there is a separate I<temps> stack in the Perl
+interpreter. Items on this stack I<are> reference counted. Typically the
+temps stack is reset at start of each statement, back to some particular


nit: at start -> at the start

wolfsage · 2025-10-22T23:00:00Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+    SV *sv_abc = newSVpvn_flags("abc", 3, SVs_TEMP);
+
+Many OPs have an SV attached to them called a I<PADTMP>. This SV has a
+lifetime which is the same as the sub which the OP is a part of, and


"This SV has a lifetime which is the same as the sub which the OP is a part of..."

This feels like "As long as the sub is defined..." but I think this means "the duration of the subroutine call"?

wolfsage · 2025-10-22T23:01:39Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+
+=head3 The SV scalar value structure
+
+As mentioned above, almost runtime data within the perl interpreter is


nit: almost runtime -> almost all runtime

wolfsage · 2025-10-22T23:31:19Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+However, if the SV comes from elsewhere, for example via a Perl array
+lookup, then its reference count doesn't need to be adjusted, and so the
+mortalising will cause it to be prematurely freed. In this case, you need
+to artificially increase the SV's reference count.


Could be nice to say "...artificially increase the SV's reference count using SvREFCNT_inc as seen below"

wolfsage · 2025-10-22T23:33:44Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+
+Sometimes you want to return a non-scalar SV, such as an AV, HV or CV.
+However, these aren't allowed directly on the argument stack. You are
+supposed instead to return a I<reference> to the AV: a bit like a Perl sub


nit: supposed instead to -> supposed to instead

wolfsage · 2025-10-22T23:37:00Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+which just returns a scalar). In this case you have to tell the C compiler
+that C<SVREF> is just another name for C<SV*>:
+
+    typedef SV *SVREF;


That's weird. Why is this necessary (when no other types have to be declared in this way)?

wolfsage · 2025-10-22T23:43:01Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+the input typemap entry for C<AV*> automatically takes care of
+dereferencing the argument and croaking if it's not actually a reference.
+The C<PUSHs()> macro simply pushes an SV onto the stack, without any
+mortalising or copying. Any "holes" in the array are filled with undefs.


I'm mildly surprised the above section didn't touch on the XPUSH* macros that extend the stack for you

wolfsage · 2025-10-22T23:44:34Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+    XSLoader::load(__PACKAGE__, $VERSION);
+
+This causes the F<Bar.so> or F<Bar.dll> file to be dynamically linked in
+and then the C<boot_Foo__Bar()> function called. This boilerplate code is


nit: function called -> function to be called

wolfsage

Alright that's enough for one night, more later

wolfsage · 2025-10-23T00:10:49Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+use C<TYPEMAP:> to individually override specific C<TYPEMAP>, C<INPUT>, or
+C<OUTPUT> entries in the system typemap. In general, typemap changes
+affect any subsequent XSUBs within the file, until further updates.  Note
+however that due a quirk in parsing, it is possible for a C<TYPEMAP:>


nit: due a quirk -> due to a quirk

wolfsage · 2025-10-23T00:11:48Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+C<OUTPUT> entries in the system typemap. In general, typemap changes
+affect any subsequent XSUBs within the file, until further updates.  Note
+however that due a quirk in parsing, it is possible for a C<TYPEMAP:>
+entry immediately I<after> an XSUB to affect that XSUB.


Wow! Can we provide more info here or a "how to avoid this"? Otherwise I wouldn't risk using it...

wolfsage · 2025-10-23T00:12:42Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+
+=back
+
+The most recently-applied entries take precedence, so for example you can


nit: recently-applied -> recently applied

wolfsage · 2025-10-23T00:21:34Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+      CODE:
+        ...
+
+Generally there no reason to use the old style any more, apart from a few


nit: there is no reason

wolfsage · 2025-10-23T00:46:41Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

+the keyword). For the first, it may appear anywhere in the input part or
+the XSUB. For the latter, it may appear anywhere in file scope, but due to
+a long-standing parser bug, the keyword's state is reset at the start of
+each XSUB, so it will only have any effect if appears just before a XSUB


nit: if appears -> if it appears

wolfsage · 2025-10-23T00:50:00Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

-semicolon is allowed after the argument list, as in
+Only one of these keywords may appear in this part, and at most once; and
+no other keywords are recognised in this part (although such keywords
+could be instead be processed in the tail or head of the preceding and


nit: could be instead be -> could instead be?

wolfsage · 2025-10-23T00:51:06Z

dist/ExtUtils-ParseXS/lib/perlxs.pod

-          XSRETURN(1);
+=head3 Auto-calling a C function
+
+In the absence of any explicit main body code via C<CODE> or C<PPCODE>,


or NOT_IMPLEMENTED_YET !

iabyn added 30 commits October 2, 2025 12:03

perlxs.pod: reindent and reformat code examples

8f8f7f0

The various XS code examples had odd and inconsistent indentation (often with 5 leading spaces) and inconsistent formatting, e.g. foo(a,b) vs foo( a, b ) vs foo(a, b). Fix that, and also remove any tab chars. Whitespace-only change.

perlxs.pod: fix up links after deleting sections

305e19c

The previous commit deleted several sections from perlxs.pod. This commit fixes things up; done as a separate commit so that the changes aren't drowned out in the diff listing.

perlxs.pod: add BNF definition section

ca7bdbd

Add a section which tries to define, semi-formally, the syntax and structure of an XS file, using a BNF-like format. See http://nntp.perl.org/group/perl.perl5.porters/268701 for the discussion of this part.

perlxs.pod: update MODULE/PACKAGE/PREFIX

f596116

Rewrite the POD for these three keywords, and in particular, treat them as one declaration, rather than three unrelated keywords.

perlxs.pod: update REQUIRE, VERSIONCHECK keywords

df51936

perlxs.pod: update PROTOTYPES: keyword

0312e37

perlxs.pod: update EXPORT_XSUB_SYMBOLS, INCLUDE(_COMMAND)

ac07e97

perlxs.pod: update TYPEMAP: keyword

09e774d

perlxs.pod: update BOOT: keyword

560391d

perlxs.pod: update FALLBACK: keyword

9af5c74

perlxs.pod: update XSUB Structure + Declaration

e5346fd

Populate the new

    =head2 The Structure of an XSUB
    =head2 An XSUB Declaration

sections.

perlxs.pod: update section 'An XSUB Parameter'

955ad90

Add some initial text for this new section, and also add a new subsection "XSUB Parameter Placeholders".

perlxs.pod: update IN_OUT etc section

9717f25

perlxs.pod: update default, length, ellipsis params

d8dcfe7

Rewrite (and retitle) these three subsections:

    =head3 Default Parameter Values
    =head3 The C<length(NAME)> Keyword
    =head3 Variable-length Parameter Lists

perlxs.pod: update: Input part, PREINIT sections

34f73e1

Add text for the new '=head2 The XSUB Input Part' section, and rewrite the existing entry for the PREINIT keyword.

perlxs.pod: update 'SCOPE: Keyword' section

fd4933d

perlxs.pod: update: init part, INIT sections

7fd59c0

Add text for the new '=head2 The XSUB Init Part' section, and rewrite the existing entry for the INIT keyword.

perlxs.pod: update: code part, autocall, C_ARGS

8e0e4de

Add text to the new

    =head2 The XSUB Code Part
    =head3 Auto-calling a C function

sections, and rewrite the existing

    =head4 The C_ARGS: Keyword

section.

perlxs.pod: update: CODE, PPCODE

4e8a05c

Rewrite these sections:

    =head3 The CODE: Keyword
    =head3 The PPCODE: Keyword

perlxs.pod: update NOT_IMPLEMENTED_YET: keyword

5ab6a04

This keyword formerly wasn't documented. The docs now say "this is what it is, but don't use it".

perlxs.pod: update: output part

6264856

Add text to the new

    =head2 The XSUB Output Part

section, and rewrite the text in these existing sections:

    =head3 The POSTCALL: Keyword
    =head3 The OUTPUT: Keyword

perlxs.pod: update: cleanup part

08a1224

Add text to the new =head2 The XSUB Cleanup Part section, and rewrite the text in this existing section: =head3 The CLEANUP: Keyword

perlxs.pod: update generic intro, PROTOTYPE

cc81622

Add text to the new =head2 XSUB Generic Keywords section, and rewrite the text in this existing section: =head3 The PROTOTYPE: Keyword

iabyn added 10 commits October 2, 2025 12:03

perlxs.pod: document ATTRS

4d48858

This keyword was undocumented, even though it had been added 25 years ago.

perlxs.pod: add "Sharing XSUB bodies" section

f66fd49

Populate the introduction to this new section.

perlxs.pod: update: ALIAS

815fe4e

Rewrite this section: =head3 The ALIAS: Keyword

perlxs.pod: update INTERFACE, INTERFACE_MACRO

7188b4a

Rewrite these sections:

    =head3 The INTERFACE: Keyword
    =head3 The INTERFACE_MACRO: Keyword

Also demote the second to be a head4 child of the first. Then expand the T_PTROBJ example to use INTERFACE as an alternative to ALIAS.

perlxs.pod: update: CASE

17cf168

Rewrite this section: =head3 The CASE: Keyword

perlxs.pod: update MY_CXT section

ae9183e

Revise the text in this section: =head2 Safely Storing Static Data in XS

perlxs.pod: update EXAMPLES section

f3f436a

Rewrite this section:

    =head1 EXAMPLES

Basically, delete the one big example in this section and instead provide links to various other examples already present in this document.

perlxs.pod: update CAVEATS, AUTHOR, A DIAGNOSTICS

9afbfda

Tweak the final few sections of perlxs.pod.

tonycoz reviewed Oct 6, 2025

johannessen suggested changes Oct 7, 2025

johannessen mentioned this pull request Oct 7, 2025

Fix 0.27 test failures in 002_icpp.t tsee/extutils-cppguess#33

Open

Leont reviewed Oct 8, 2025

wolfsage reviewed Oct 17, 2025

tonycoz reviewed Oct 20, 2025

dist/ExtUtils-ParseXS/lib/perlxs.pod

Comment on lines +4296 to +4297

int
X::Y::new(int i)

tonycoz Oct 20, 2025

Shouldn't new() return an X::Y *?

tonycoz reviewed Oct 21, 2025

wolfsage reviewed Oct 22, 2025

wolfsage reviewed Oct 23, 2025

		If you need to coerce an SV to a string (e.g. before modifying its string
		value) then use C<SvPV_force()> or one of its variants. For example if

		C<ALIAS> which is more suited for autocall. Note that C<ALIAS> should not
		be used together with either of C<INTERFACE> or C<ATTRS>.

		are converted back and forth via C<(UV)> casts. A few unsigned types such
		as C<I16> and C<U32> are instead mapped to C<T_U_SHORT> and C<T_U_LONG> XS


		=head3 The SV scalar value structure

		As mentioned above, almost runtime data within the perl interpreter is


		=back

		The most recently-applied entries take precedence, so for example you can

iabyn commented Oct 2, 2025

johannessen left a comment

Leont Oct 8, 2025 • edited

iabyn commented Oct 9, 2025 via email

tonycoz commented Oct 9, 2025

iabyn commented Oct 12, 2025 via email

iabyn commented Oct 17, 2025 via email

tonycoz commented Oct 21, 2025

iabyn commented Oct 21, 2025 via email
